Conversation

@muskan-agarwal26 (Contributor)

Proposed commit message

The initial release includes the unified_log data stream and an associated dashboard.

macOS fields are mapped to their corresponding ECS fields where possible.

Test samples were derived from live data samples.

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.
  • I have verified that any added dashboard complies with Kibana's Dashboard good practices

How to test this PR locally

To test the macOS package:

  • Clone the integrations repo.
  • Install elastic-package locally.
  • Start the Elastic Stack using elastic-package.
  • Move to the integrations/packages/macos directory.
  • Run the tests with the following command:

elastic-package test

Run asset tests for the package
2025/10/29 15:10:56  INFO License text found in "/root/GITHUB/integrations/LICENSE.txt" will be included in package
--- Test results for package: macos - START ---
╭─────────┬─────────────┬───────────┬────────────────────────────────────────────────────────────────┬────────┬──────────────╮
│ PACKAGE │ DATA STREAM │ TEST TYPE │ TEST NAME                                                      │ RESULT │ TIME ELAPSED │
├─────────┼─────────────┼───────────┼────────────────────────────────────────────────────────────────┼────────┼──────────────┤
│ macos   │             │ asset     │ dashboard macos-4b49d421-2f03-4dd2-891f-cbd7e2786e35 is loaded │ PASS   │      1.622µs │
│ macos   │             │ asset     │ dashboard macos-4fae07f9-fff4-49d0-8ed6-54a63b4c6426 is loaded │ PASS   │        363ns │
│ macos   │ unified_log │ asset     │ index_template logs-macos.unified_log is loaded                │ PASS   │        739ns │
│ macos   │ unified_log │ asset     │ ingest_pipeline logs-macos.unified_log-0.1.0 is loaded         │ PASS   │        213ns │
╰─────────┴─────────────┴───────────┴────────────────────────────────────────────────────────────────┴────────┴──────────────╯
--- Test results for package: macos - END   ---
Done
Run pipeline tests for the package
--- Test results for package: macos - START ---
╭─────────┬─────────────┬───────────┬─────────────────────────────────────────────┬────────┬──────────────╮
│ PACKAGE │ DATA STREAM │ TEST TYPE │ TEST NAME                                   │ RESULT │ TIME ELAPSED │
├─────────┼─────────────┼───────────┼─────────────────────────────────────────────┼────────┼──────────────┤
│ macos   │ unified_log │ pipeline  │ (ingest pipeline warnings test-unified.log) │ PASS   │ 708.724132ms │
│ macos   │ unified_log │ pipeline  │ test-unified.log                            │ PASS   │ 964.410061ms │
╰─────────┴─────────────┴───────────┴─────────────────────────────────────────────┴────────┴──────────────╯
--- Test results for package: macos - END   ---
Done
Run policy tests for the package
--- Test results for package: macos - START ---
No test results
--- Test results for package: macos - END   ---
Done
Run static tests for the package
--- Test results for package: macos - START ---
No test results
--- Test results for package: macos - END   ---
Done
Run system tests for the package
--- Test results for package: macos - START ---
No test results
--- Test results for package: macos - END   ---
Done

Related issues

Screenshots

(Screenshots: macos-1, macos-2)

@muskan-agarwal26 muskan-agarwal26 requested a review from a team as a code owner October 29, 2025 09:44
@andrewkroh andrewkroh added labels on Oct 29, 2025: documentation (Improvements or additions to documentation; applied to PRs that modify *.md files), Crest (Contributions from Crest development team), New Integration (Issue or pull request for creating a new integration package), dashboard (Relates to a Kibana dashboard bug, enhancement, or modification).


This is a good start, but I don't think it's complete enough or shows the right metrics for a security analyst:

The total request/response/network byte counts aren't that useful without time context or comparative baselines.

Same feedback for "Events by response status" and "Events by privacy status": these basically say "everything worked" but don't give actionable info.

Also, regarding the top source IP data table: it doesn't show which processes are communicating, destination IPs, protocols/ports, or time of activity, which are more relevant...

Can we try editing some of this, @muskan-agarwal26 @piyush-elastic? Can you let me know whether any of these visualizations are possible based on the information available in the logs?

  • "Network connections over time" (maybe an area chart?) showing connections/minute, color-coded by new/established/closed
  • "Active network connections" table showing process, local port, remote IP, state, duration
  • "Top external destinations" table showing domains/IPs and # of connections

cc @jamiehynds

Contributor Author:

Sure, @cpascale43 , I’ll follow your suggestions and make the necessary changes.


Similar feedback to the Network dashboard - can we make sure this has more security context?

The biggest enhancements would be an events timeline, and a more security-specific events breakdown table.

The pie charts up top aren't very meaningful on their own, so I think we could replace them with an area chart showing aggregated security events over time. Is there a way a user could see key events and when they occurred represented at the top, like

  • "3 failed authentication attempts at 12:23"
  • "New process launched from /tmp at 9:15"

Also, the "Events by Subsystem" and "Events by Category" bar charts are a bit strange; I think it would be more useful to bucket the events into categories like "Authentication", "Process", "Network", "File system", etc., in line with the categories outlined in the issue.

Can we make a "Security Event Types" table showing a breakdown of the number of events per category? Something like

Authentication Events
---------
- Successful logins | 45
- Failed logins | 3
- Privilege escalation | 12 

Contributor Author:

@cpascale43,
As suggested, we will remove the following visuals: all pie charts and bar charts.

And add the following visual:

Aggregated Security Events Over Time – This will display the breakdown of each category over time. It will be created using a custom field derived within the pipeline by splitting events into categories based on predicates.

We couldn’t create the Security Event Types visual as suggested, since we’re unable to break down subcategories within categories. For example, under the authentication category, subcategories like “successful logins” or “failed logins” aren’t available, as such details are not present in the logs.
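As a rough sketch of the approach described above (the field name macos.event_bucket and the conditions are illustrative assumptions, not taken from this PR), the pipeline could tag each event with a category bucket derived from predicate-style matches:

```yaml
# Hypothetical ingest pipeline fragment: derive a custom bucket field from the
# raw message, mirroring the predicate-based splitting described above.
- set:
    field: macos.event_bucket
    value: authentication
    if: "ctx.message != null && ctx.message.toLowerCase().contains('sudo')"
- set:
    field: macos.event_bucket
    value: network
    if: "ctx.macos?.event_bucket == null && ctx.subsystem == 'com.apple.network'"
- set:
    field: macos.event_bucket
    value: other
    if: "ctx.macos?.event_bucket == null"
```

The dashboard's "Aggregated Security Events Over Time" area chart could then simply split on this single keyword field.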

/packages/lumos @elastic/security-service-integrations
/packages/lyve_cloud @elastic/security-service-integrations
/packages/m365_defender @elastic/security-service-integrations
/packages/macos @elastic/security-service-integrations
Contributor:

This should be owned by @nfritts's team; the input is owned by their team, as are the system unified logs. Any reason we are linked here?


Good catch @narph, agreed. Adding @marc-gr as he has been assisting with development here.

@@ -0,0 +1,3 @@
dependencies:
ecs:
reference: git@v8.17.0
Contributor:

We probably want to use 9.2.0, since it is the latest.
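Per the commit note later in the thread ("Update ecs version to 9.2.0"), the dependency file would become:

```yaml
# _dev/build/build.yml with the ECS reference bumped to the latest version
dependencies:
  ecs:
    reference: git@v9.2.0
```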

@@ -0,0 +1,80 @@
predicate:
{{#if authentication}}
Contributor:

Do we want this, or multiple data streams instead? Are there any advantages to this approach vs the other?

Contributor Author:

Since the logs come from a single source and only differ by filters such as authentication, user, or network, a single data stream is sufficient. We usually create separate streams only when data originates from different endpoints.

Contributor:

We also have many cases where data streams are more of a logical separation, e.g. mimecast, google_workspace, postgresql, etc. In this case I think it could simplify the logic in the pipelines by quite a lot, since we will always know the event types and can reduce the logic needed to identify them. It would make for cleaner pipelines that are easier to maintain and less prone to breaking if anything changes. Is this something we could consider?

Contributor:

Just some feedback: even if the data comes from a single source, if the context/dataset differs (authentication, user, network), the events should be stored in different data streams, because they may have different retention requirements, different volumes, and different custom processing needs.

Having all the data in a single data stream has already caused issues for other integrations like Fortigate and Palo Alto. Since this integration can generate an enormous amount of logs, it would cause the same problem: a user may want to store some events for a longer time but can't, because everything is in the same data stream.

For example, in our case we would have one retention policy for authentication and user events and another for network events; this would not be possible with a single data stream.

Contributor Author:

Agreed @leandrojmp, we are working on separating the events into different data streams and will update the PR soon.
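A sketch of what per-category data stream manifests might look like after the split (names and descriptions are illustrative; the actual layout depends on the follow-up changes):

```yaml
# packages/macos/data_stream/authentication/manifest.yml (hypothetical)
title: macOS authentication events
type: logs
streams:
  - input: unifiedlogs
    title: macOS unified log (authentication)
    description: Collect authentication-related events from the macOS unified log.
```

Each data stream then gets its own index template, retention/ILM policy, and a smaller pipeline that only handles its own event class.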

1. Update ecs version to 9.2.0
2. Update codeowner entry.
title: Collect unified logs from macOS
description: Collecting unified logs from macOS.
owner:
github: elastic/security-service-integrations
Contributor:

same here, the owner should be the @elastic/sec-windows-platform team, @marc-gr is @elastic/sec-linux-platform taking over this one?

/packages/lumos @elastic/security-service-integrations
/packages/lyve_cloud @elastic/security-service-integrations
/packages/m365_defender @elastic/security-service-integrations
/packages/macos @elastic/sec-linux-platform

should be @elastic/sec-windows-platform

1. Change owner in manifest and codeowners
@marc-gr (Contributor) left a comment:

left some initial comments

@@ -0,0 +1,312 @@
---
description: Pipeline for processing authentication logs.
Contributor:

Suggested change
description: Pipeline for processing authentication logs.
description: Pipeline for processing common fields

"kind": "event",
"original": "{\"timezoneName\":\"\",\"messageType\":\"Error\",\"eventType\":\"logEvent\",\"source\":null,\"formatString\":\"rejecting write of key(s) %{public}s in { %{public}s, %{public}s, %{public}s, %{public}s, managed: %d } from process %{public}d (%{public}s) because %{public}s\",\"userID\":502,\"activityIdentifier\":0,\"subsystem\":\"com.apple.defaults\",\"category\":\"cfprefsd\",\"threadID\":273730,\"senderImageUUID\":\"FEDAF68C-F484-3FCA-8866-A9E7E46CE7B6\",\"backtrace\":{\"frames\":[{\"imageOffset\":1818634,\"imageUUID\":\"FEDAF68C-F484-3FCA-8866-A9E7E46CE7B6\"}]},\"bootUUID\":\"218031E6-E47F-4A77-B7FC-5A57B049F4BC\",\"processImagePath\":\"\\/usr\\/sbin\\/cfprefsd\",\"senderImagePath\":\"\\/System\\/Library\\/Frameworks\\/CoreFoundation.framework\\/Versions\\/A\\/CoreFoundation\",\"timestamp\":\"2025-10-01 18:19:11.945508+0530\",\"machTimestamp\":454777460134003,\"eventMessage\":\"rejecting write of key(s) CKStartupTime in { secd, test, kCFPreferencesAnyHost, \\/Users\\/test\\/Library\\/Preferences\\/secd.plist, managed: 0 } from process 4954 (secd) because setting these preferences requires user-preference-write or file-write-data sandbox access\",\"processImageUUID\":\"04C516B8-C8E5-30EF-AC49-1631528F5645\",\"traceID\":35866893965594628,\"processID\":4944,\"senderProgramCounter\":1818634,\"parentActivityIdentifier\":0}",
"type": [
"info"
Contributor:

Shouldn't this match the level? In this case, error.

Contributor Author:

Removing event.type from advanced_monitoring pipeline, as we haven't mapped event.category.
So matching isn't needed here.

Comment on lines +1 to +4
predicate:
- 'eventMessage CONTAINS[c] "exec" OR eventMessage CONTAINS[c] "fork" OR eventMessage CONTAINS[c] "exited" OR eventMessage CONTAINS[c] "terminated"'
- 'subsystem == "com.apple.securityd" AND (composedMessage CONTAINS "code signing" OR composedMessage CONTAINS "not valid")'
- 'composedMessage CONTAINS "com.apple.quarantine"'
Contributor:

It could be interesting to have these as default values and let the user override them, instead of hardcoding them here.

Contributor Author:

We have added an option for users to add their own predicates as well.
Do you mean to remove the hardcoded ones?
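One possible shape for the reviewer's suggestion (a sketch: the variable name and metadata are assumptions, while the default predicates are the ones from the snippet above):

```yaml
# Hypothetical manifest variable: ship the current predicates as defaults
# that users can override, rather than hardcoding them in the input config.
vars:
  - name: predicates
    type: text
    title: Predicate filters
    multi: true
    required: false
    show_user: true
    default:
      - 'eventMessage CONTAINS[c] "exec" OR eventMessage CONTAINS[c] "fork" OR eventMessage CONTAINS[c] "exited" OR eventMessage CONTAINS[c] "terminated"'
      - 'subsystem == "com.apple.securityd" AND (composedMessage CONTAINS "code signing" OR composedMessage CONTAINS "not valid")'
      - 'composedMessage CONTAINS "com.apple.quarantine"'
```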

"id": "248"
},
"log": {
"level": "Default"
Contributor:

It would make sense to standardize this value.

"user": [
"501",
"248",
"\"Setup User\"",
Contributor:

We might want to sanitize this and remove the quotes.

@P1llus (Member) left a comment:

Focusing mostly on the data and ingest pipelines, adding my two cents; feel free to ignore them if needed.

@@ -0,0 +1,50 @@
predicate:
- 'process contains "sudo" OR composedMessage CONTAINS "sudo" OR process contains "su"'
Member:

This has the same predicate as the authentication data stream; how does that work? Does it ingest the data twice?

I can see why splitting it up into different data streams was discussed earlier, and I agree it would be too big otherwise, but if they have the same predicate then maybe the split wouldn't be as useful?

Contributor Author:

Removing it from user_and_account_management; it should be part of authentication. Thanks.

- set:
field: process.pid
tag: set_process_pid_from_unified_log_process_id
copy_from: macos.process.id
Member:

Any reason we need to store this sort of information twice?

Contributor Author:

It is stored first in the custom mapping and then in the ECS field. The same applies to the two fields below.

- set:
field: process.thread.id
tag: set_process_thread_id_from_unified_log_thread_id
copy_from: macos.thread_id
Member:

Same here

- set:
field: '@timestamp'
tag: set_@timestamp_from_unified_log_timestamp
copy_from: macos.timestamp
Member:

And a third one?

type: string
ignore_missing: true
- append:
field: user.id
Member:

This is not supposed to be an array; it's a single keyword representing the user most related to the event. I see related.user is also there, which is where values should end up if there is more than one.
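A minimal sketch of that convention, assuming a hypothetical custom source field macos.user_id (the actual field name in the pipeline may differ):

```yaml
# Keep user.id single-valued; accumulate every observed user in related.user.
- set:
    field: user.id
    copy_from: macos.user_id        # hypothetical custom field
    ignore_empty_value: true
- append:
    field: related.user
    value: '{{{user.id}}}'
    allow_duplicates: false
    if: "ctx.user?.id != null"
```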

external: ecs
- name: event.module
type: constant_keyword
external: ecs
Member:

If you are setting the type and the value yourself, then it's most likely not external? Unsure whether that might overwrite it.

Contributor Author:

The main reason for using external is to retrieve definitions from ECS. Specifying a type and value will override those fields, but the description will still be sourced from the external reference.

type: long
- name: home_directory_path
type: keyword
- name: hostname
Member:

Do we really want to double-map all of these, keeping both the custom fields and the ECS fields?

Contributor Author:

It is common practice in integrations to map both, so we followed the same approach here.

pattern_definitions:
GREEDYMULTILINE: '(.|\n)*'
patterns:
- '^\[%{WORD}\] %{DATA}\:(?:%{SPACE}mach=%{WORD:macos.event.message.mach:boolean})?(?:%{SPACE}listener=%{WORD:macos.event.message.listener:boolean})?(?:%{SPACE}peer=%{WORD:macos.event.message.peer:boolean})?(?:%{SPACE}name=%{GREEDYDATA:macos.event.message.name})?'
Member:

Is this really the only way we could parse this? I see that it's quite unstructured, but it seems at least some of this could be put into custom patterns and reused across many of these?
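One possible shape for that suggestion, factoring the repeated optional key=value groups into pattern_definitions (a sketch; the fragment names MSGFLAGS/MSGSEQ/MSGACK are made up for illustration):

```yaml
- grok:
    field: message
    pattern_definitions:
      # Reusable fragments for the repeated optional key=value groups.
      MSGFLAGS: '(?:%{SPACE}flags=\[%{DATA:macos.event.message.flags}\])?'
      MSGSEQ: '(?:%{SPACE}seq=%{DATA:macos.event.message.seq},)?'
      MSGACK: '(?:%{SPACE}ack=%{DATA:macos.event.message.ack},)?'
    patterns:
      - '^%{WORD} \[%{DATA}\]%{MSGFLAGS}%{MSGSEQ}%{MSGACK}'
```

Each top-level pattern then stays short, and a change to one fragment applies everywhere it is reused.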

- '^%{WORD} \[%{DATA}\](?:%{SPACE}flags=\[%{DATA:macos.event.message.flags}\])?(?:%{SPACE}seq=%{DATA:macos.event.message.seq},)?(?:%{SPACE}ack=%{DATA:macos.event.message.ack},)?(?:%{SPACE}win=%{DATA:macos.event.message.win})?(?:%{SPACE}state=%{DATA:macos.event.message.state})?(?:%{SPACE}rcv_nxt=%{DATA:macos.event.message.rcv_nxt},)?(?:snd_una=%{DATA:macos.event.message.snd_una})'
- '^nw_protocol_boringssl_signal_connected\(%{NUMBER}\) \[%{DATA:macos.event.message.connection_identifier}\]\[%{DATA}\] TLS connected \[(?:version\(%{DATA:macos.event.message.tls_version}\))?(?:%{SPACE}ciphersuite\(%{DATA:macos.event.message.cipher_suite}\))?(?:%{SPACE}group\(%{DATA:macos.event.message.group}\))?(?:%{SPACE}signature_alg\(%{DATA:macos.event.message.signature_alg}\))?(?:%{SPACE}alpn\(%{DATA:macos.event.message.alpn}\))?(?:%{SPACE}resumed\(%{DATA:macos.event.message.resumed}\))?(?:%{SPACE}offered_ticket\(%{DATA:macos.event.message.offered_ticket}\))?(?:%{SPACE}false_started\(%{DATA:macos.event.message.false_started}\))?(?:%{SPACE}ocsp_received\(%{DATA:macos.event.message.ocsp_received}\))?(?:%{SPACE}sct_received\(%{DATA:macos.event.message.sct_received}\))?(?:%{SPACE}connect_time\(%{DATA:macos.event.message.connection_time}\))?(?:%{SPACE}flight_time\(%{DATA:macos.event.message.flight_time}\))?(?:%{SPACE}rtt\(%{DATA:macos.event.message.rtt}\))?(?:%{SPACE}write_stalls\(%{DATA:macos.event.message.write_stalls:int}\))?(?:%{SPACE}read_stalls\(%{DATA:macos.event.message.read_stalls:int}\))?(?:%{SPACE}pake\(%{DATA:macos.event.message.pake}\))?\]'
- '^Task \<%{DATA:macos.event.message.task_uid}\>.\<%{NUMBER}\>%{SPACE}summary for %{DATA} \{(?:transaction_duration_ms=%{NUMBER:macos.event.message.transaction_duration_ms:int},)?(?:%{SPACE}response_status=%{NUMBER:macos.event.message.response_status:int},)?(?:%{SPACE}connection=%{NUMBER:macos.event.message.connection:int},)?(?:%{SPACE}protocol=%{DATA:macos.event.message.protocol},)?(?:%{SPACE}domain_lookup_duration_ms=%{NUMBER:macos.event.message.domain_lookup_duration_ms:int},)?(?:%{SPACE}connect_duration_ms=%{NUMBER:macos.event.message.connection_duration_ms:int},)?(?:%{SPACE}secure_connection_duration_ms=%{NUMBER:macos.event.message.secure_connection_duration_ms:int},)?(?:%{SPACE}private_relay=%{WORD:macos.event.message.private_relay:boolean},)?(?:%{SPACE}request_start_ms=%{NUMBER:macos.event.message.request_start_ms:int},)?(?:%{SPACE}request_duration_ms=%{NUMBER:macos.event.message.request_duration_ms:int},)?(?:%{SPACE}response_start_ms=%{NUMBER:macos.event.message.response_start_ms:int},)?(?:%{SPACE}response_duration_ms=%{NUMBER:macos.event.message.response_duration_ms:int},)?(?:%{SPACE}request_bytes=%{NUMBER:macos.event.message.request_bytes:long},)?(?:%{SPACE}response_bytes=%{NUMBER:macos.event.message.response_bytes:long},)?(?:%{SPACE}cache_hit=%{WORD:macos.event.message.cache_hit:boolean})?\}'
Contributor:

It was like 1 in a few million, but I did recently observe the connect_duration_ms value being larger than a valid int. Would it be worth updating the durations here to be long?

Contributor Author:

Replacing int with long; this will handle the above scenario.
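For example, the capture in question would change along these lines (taking the connect_duration_ms group from the pattern above):

```yaml
# Before: connect_duration_ms=%{NUMBER:macos.event.message.connection_duration_ms:int}
# After: widen to long so values above 2^31 - 1 still parse.
- '(?:%{SPACE}connect_duration_ms=%{NUMBER:macos.event.message.connection_duration_ms:long},)?'
```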

@P1llus (Member) commented Nov 26, 2025:

@muskan-agarwal26, you mentioned this in one of the PR comments:

Removing event.type from advanced_monitoring pipeline, as we haven't mapped event.category.
So matching isn't needed here.

I am a bit concerned about that, because every security event should have both event.category and event.type.
Usually we would want as many events as possible to carry the most accurate event.type and event.category, but if that is not feasible, then each data stream needs a fallback that matches the data stream, for example the "host" category and "info" type.

These two are usually mandatory, with a few exceptions, as both UI elements and many features in security solutions expect them to be filled.
If you want, we can try to map these out together.
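A sketch of such a fallback at the end of each data stream's pipeline, using the values suggested above (append keeps the fields as the arrays ECS expects; the conditions only fire when no more specific mapping was set earlier):

```yaml
# Fallback: guarantee event.category and event.type are always populated.
- append:
    field: event.category
    value: host
    if: "ctx.event?.category == null"
- append:
    field: event.type
    value: info
    if: "ctx.event?.type == null"
```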

@muskan-agarwal26 (Contributor Author):

Hi @P1llus, I have removed the mapping in advanced_monitoring, as there is no single type of event that will be fetched there; there can be multiple types that we are not aware of.
Still, if we do need to perform the mapping, let's do it together.

@muskan-agarwal26 muskan-agarwal26 closed this by deleting the head repository Nov 27, 2025
mergify bot commented Dec 1, 2025:

⚠️ The sha of the head commit of this PR conflicts with #16170. Mergify cannot evaluate rules on this PR. ⚠️

1 similar comment

@muskan-agarwal26 (Contributor Author):

Hi @cpascale43, @narph, @nfritts, @leandrojmp, @P1llus, @marc-gr, @btrieger,
I raised another PR, #16170, for this integration, as the earlier fork was unintentionally removed while organizing repositories; hence this PR was closed.
Please consider the new one for review.

10 participants